ABSTRACT. During a portion of a test, N gunners fired two rounds apiece.
The overall proportion of hits on first rounds was very close to the overall
proportion of hits on second round shots. However, an individual gunner's
performance on his second shot was positively correlated with his performance
on the first round.


The parameter of interest was p, the probability of hit using the firing
device. The proportion of hits among the 2N shots was the natural point
estimate of p. However, in calculating interval estimates for p at a given
confidence level, or tests of hypotheses of the form $p \ge p_0$ at a given
significance level, the situation became more subtle. Since the first round
outcome did not deterministically predict the second round outcome, we
clearly had more information than just the N first round shots. On the
other hand, the assumption that we had 2N independent trials was not
justified.


In this paper, a model is proposed for the analysis of this and similar
situations. This model generalizes the two-round case and considers data
in blocks when the observations within blocks are not independent.


I.  INTRODUCTION.  During  a portion  of  the  test  of  a firing  device,  each 
gunner  fired  a volley  consisting  of  two  rounds.  The  outcome  of  each  round 
was  either  hit  (H)  or  miss  (M)  and  one  of  the  purposes  of  the  test  was  to 
draw  inferences  about  p,  the  probability  of  hit. 

The  following  table  depicts  a typical  segment  of  the  results: 

Here, the overall proportion of hits on a first round is .6 and the
overall proportion of hits on a second round is also .6. The probability of
hit on a first round appears to be the same as the probability of hit on a
second round, so the overall proportion of hits is an unbiased point
estimate of p. However, the conditional probability of hit on a second
round after having scored a hit on the first round of the volley is 5/6,
which is greater than .6. In other words, performance on the second round
is not independent of performance on the first round. Suppose n volleys
were fired. We do not have 2n independent rounds. On the other hand, since
the outcome on the first round did not predict the outcome on the second
round deterministically, we have more information than just the n first
round shots. The problem is to calculate confidence intervals and tests of
hypotheses about p that reflect our true amount of knowledge realistically.
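
The three proportions quoted above can be recomputed from raw volley outcomes. The sketch below uses hypothetical counts, chosen only so that they match the proportions in the text (.6, .6 and 5/6); they are an assumption for illustration and are not the test data.

```python
from fractions import Fraction

# HYPOTHETICAL volley outcomes (first round, second round). These counts are
# an assumption chosen only to match the proportions quoted in the text.
volleys = [("H", "H")] * 5 + [("H", "M")] * 1 + [("M", "H")] * 1 + [("M", "M")] * 3

n = len(volleys)
first_hits = sum(1 for a, _ in volleys if a == "H")
second_hits = sum(1 for _, b in volleys if b == "H")
both_hits = sum(1 for a, b in volleys if a == "H" and b == "H")

p_first = Fraction(first_hits, n)                        # hit rate, first round
p_second = Fraction(second_hits, n)                      # hit rate, second round
p_second_given_first = Fraction(both_hits, first_hits)   # P(hit on 2nd | hit on 1st)
p_overall = Fraction(first_hits + second_hits, 2 * n)    # pooled point estimate of p

print(p_first, p_second, p_second_given_first, p_overall)
```

The two marginal rates agree, yet the conditional rate is higher, which is the within-volley dependence the model in the next section is meant to capture.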
II. THE MODEL. n players are selected at random. The probability of hit
for a player comes from a distribution with mean p and unknown
variance $\sigma^2$. Then $P_1, \ldots, P_n$, the players' hit probabilities,
are independent and identically distributed random variables with mean p.
The i'th player fires $k_i$ shots, $k_i \ge 1$, $i = 1, \ldots, n$. The data is
$\{X_{ij} : i = 1, \ldots, n,\ j = 1, \ldots, k_i\}$ where $X_{ij} = 1$ if the
i'th player scored a hit on the j'th trial and 0 otherwise. If $i \ne j$ then
$X_{ir}$ and $X_{js}$ are independent. $X_{ir}$ and $X_{is}$ are correlated
but are conditionally independent Bernoulli variables with parameter $p_i$
given $\{P_i = p_i\}$.
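
A minimal sketch of how data could be generated under this model may help fix the notation. The function name `simulate_volleys`, the shot counts and the uniform skill distribution are assumptions made for illustration; the model itself only requires that the $P_i$ be independent with mean p.

```python
import random

def simulate_volleys(shots_per_player, draw_hit_prob, rng):
    """Return X, where X[i][j] is 1 if player i hit on trial j, else 0."""
    data = []
    for k_i in shots_per_player:
        p_i = draw_hit_prob(rng)  # P_i: this player's own hit probability
        # Given P_i = p_i, the k_i shots are independent Bernoulli(p_i) trials.
        data.append([1 if rng.random() < p_i else 0 for _ in range(k_i)])
    return data

rng = random.Random(0)
# Assumed skill distribution: uniform on (0.5, 1), so p = 0.75.
X = simulate_volleys([2, 2, 5, 3], lambda r: r.uniform(0.5, 1.0), rng)
for i, row in enumerate(X, start=1):
    print(f"player {i}: {row}")
```

Shots by different players are independent because each player draws a separate $P_i$; shots by the same player share one $P_i$, which is what makes them correlated.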
III. THE TEST STATISTIC. Set $G_i = \sum_{j=1}^{k_i} X_{ij}$, $i = 1, \ldots, n$,
and let $T = \sum_{i=1}^{n} (G_i/k_i)/n$. Then, using the law of conditional
expectation, $E(G_i) = E\,E(G_i \mid P_i) = E(k_i P_i) = k_i p$, so that T is an
unbiased estimate of p.

$E(G_1^2) = E\,E(G_1^2 \mid P_1) = E(k_1 P_1 + k_1(k_1-1)P_1^2)
= k_1 p + k_1(k_1-1)(p^2 + \sigma^2)$ so that

$$\mathrm{Var}(G_1) = k_1(p - p^2) + k_1(k_1-1)\sigma^2 \qquad (1)$$

If we set $A = \sum_{i=1}^{n} 1/k_i$ then

$$\mathrm{Var}(T) = (A(p - p^2) + \sigma^2(n - A))/n^2 \qquad (2)$$

To utilize T as a test statistic, it is necessary to estimate Var(T).

The following lemma is easy to verify: If $Y_1, \ldots, Y_n$ are independent
with a common mean and $\mathrm{Var}(Y_i) = \sigma_i^2$, $i = 1, \ldots, n$, then
$E \sum_{i=1}^{n} (Y_i - \bar{Y})^2 = ((n-1)/n) \sum_{i=1}^{n} \sigma_i^2$.
Applying the lemma with $Y_i = G_i/k_i$ and using (1),

$$E \sum_{i=1}^{n} (G_i/k_i - T)^2 = ((n-1)/n)(A(p - p^2) + \sigma^2(n - A)). \qquad (3)$$
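
One way to verify the lemma, and the step from it to (3), is sketched below. The symbol $\mu$ for the common mean is introduced here for convenience and does not appear in the original text.

```latex
\begin{align*}
\sum_{i=1}^{n} (Y_i - \bar{Y})^2
  &= \sum_{i=1}^{n} (Y_i - \mu)^2 - n(\bar{Y} - \mu)^2, \\
E \sum_{i=1}^{n} (Y_i - \mu)^2 &= \sum_{i=1}^{n} \sigma_i^2, \qquad
E\, n(\bar{Y} - \mu)^2 = n \operatorname{Var}(\bar{Y})
  = \frac{1}{n} \sum_{i=1}^{n} \sigma_i^2, \\
E \sum_{i=1}^{n} (Y_i - \bar{Y})^2
  &= \frac{n-1}{n} \sum_{i=1}^{n} \sigma_i^2 . \\
\intertext{With $Y_i = G_i/k_i$, so that $\bar{Y} = T$ and $\mu = p$:}
\sigma_i^2 = \frac{\operatorname{Var}(G_i)}{k_i^2}
  &= \frac{p - p^2}{k_i} + \sigma^2\Bigl(1 - \frac{1}{k_i}\Bigr), \qquad
\sum_{i=1}^{n} \sigma_i^2 = A(p - p^2) + \sigma^2 (n - A).
\end{align*}
```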

Letting $D = \sum_{i=1}^{n} (G_i/k_i - T)^2$, it follows from (2) and (3)
that $D/(n(n-1))$ is an unbiased estimate of Var T. The statistic that is
proposed is, then, T/E where $E = \sqrt{D/(n(n-1))}$, with $(T - p)/E$ treated
as approximately standard normal. If $P[U < x] = 1 - \alpha/2$ for U
standard normal then $T - Ex \le p \le T + Ex$ is an approximate $1 - \alpha$
confidence interval for p. Another application would be to test the
hypothesis $H_0$: $p \ge .9$ vs. $H_1$: $p < .9$ using the rejection criterion
$(T - .9)/E \le -x$ to achieve a significance level of approximately $\alpha/2$.
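
The computation of T, E and the interval can be sketched as follows. The function name `block_interval` and the sample counts are hypothetical; the normal quantile comes from Python's standard `statistics.NormalDist`.

```python
import math
from statistics import NormalDist

def block_interval(hits, shots, alpha=0.05):
    """hits[i] = G_i, shots[i] = k_i; returns (T, E, lower, upper)."""
    n = len(hits)
    if n < 2:
        raise ValueError("need at least two players to estimate Var(T)")
    y = [g / k for g, k in zip(hits, shots)]   # Y_i = G_i / k_i
    T = sum(y) / n                             # unbiased estimate of p
    D = sum((y_i - T) ** 2 for y_i in y)       # sum of squared deviations
    E = math.sqrt(D / (n * (n - 1)))           # estimated standard deviation of T
    x = NormalDist().inv_cdf(1 - alpha / 2)    # P[U < x] = 1 - alpha/2
    return T, E, T - E * x, T + E * x

# Hypothetical data: four players, five shots each.
T, E, lo, hi = block_interval([5, 3, 4, 2], [5, 5, 5, 5])
print(f"T={T:.3f}  E={E:.3f}  interval=({lo:.3f}, {hi:.3f})")
```

The sketch does not clip the interval to [0, 1], so with few players or a T near 1 the upper limit can exceed 1.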

IV. A REFINEMENT. If $C_1, \ldots, C_n$ are any real numbers such that
$\sum_{i=1}^{n} C_i k_i = 1$ then $T^* = \sum_{i=1}^{n} C_i G_i$ is an unbiased
estimate of p. The choice of $C_i = 1/(n k_i)$ was made to facilitate
estimating the variance of $T^*$. This corresponds to weighting each player
equally. Another possibility would be $C_i = 1/N$, $N = \sum k_i$, i.e.
weighting each shot equally. Using Lagrange multipliers to minimize
$\sum C_i^2\,\mathrm{Var}(G_i)$ subject to the condition $\sum C_i k_i = 1$
yields the result $C_i = K/(p - p^2 + \sigma^2(k_i - 1))$ where K is a
constant of proportionality.
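
The three weighting schemes can be compared numerically. The sketch below assumes known values of p and $\sigma^2$ purely for illustration (in practice both are unknown and would have to be estimated), uses $\mathrm{Var}(G_i) = k_i(p - p^2) + k_i(k_i - 1)\sigma^2$, and takes hypothetical shot counts; the function names are likewise illustrative.

```python
def var_G(k, p, sigma2):
    """Var(G_i) for a player firing k shots: k(p - p^2) + k(k - 1) sigma^2."""
    return k * (p - p * p) + k * (k - 1) * sigma2

def var_T_star(C, shots, p, sigma2):
    """Variance of T* = sum C_i G_i for independent players."""
    return sum(c * c * var_G(k, p, sigma2) for c, k in zip(C, shots))

def optimal_weights(shots, p, sigma2):
    """C_i proportional to 1/(p - p^2 + sigma^2 (k_i - 1)), with sum C_i k_i = 1."""
    raw = [1.0 / (p - p * p + sigma2 * (k - 1)) for k in shots]
    scale = sum(r * k for r, k in zip(raw, shots))  # fixes the constant K
    return [r / scale for r in raw]

shots = [1, 2, 5, 10]              # hypothetical k_i
p, sigma2 = 0.75, 0.5 ** 2 / 12    # assumed: P_i uniform on (0.5, 1)
n, N = len(shots), sum(shots)
schemes = {
    "equal players": [1 / (n * k) for k in shots],
    "equal shots": [1 / N for _ in shots],
    "minimum variance": optimal_weights(shots, p, sigma2),
}
for name, C in schemes.items():
    print(f"{name:17s} Var(T*) = {var_T_star(C, shots, p, sigma2):.5f}")
```

When all $k_i$ are equal the three schemes coincide, so the choice of weights only matters when players fire different numbers of shots.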

V. A SIMULATION. Since a normal approximation was used, a simulation was run
to test the accuracy of this method. A situation was considered in which
four players were selected. Their probabilities of success were distributed
uniformly on (.5,1) so that the overall probability of success was .75. Each
player fired 5 shots. 95% confidence intervals were constructed using both
the proposed statistic and using

$$T \pm 1.96\sqrt{T(1 - T)/N} \qquad (4)$$

i.e. neglecting the heterogeneity of the players. The program calculated the
proportion of times the confidence interval contained .75, the true value
of p.

For three runs, the results were .97, .96 and .97 for the proposed
interval and .81, .77 and .78 using (4).
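
The experiment described in this section can be sketched in Python as below. This is an illustrative harness under the stated settings (four players, five shots, $P_i$ uniform on (.5,1), 500 replications), not the program used for the paper; the function name `coverage` and the seed are assumptions, and the proportions it prints depend on the random number generator and need not match the figures quoted above.

```python
import math
import random

def coverage(runs=500, players=4, shots=5, p_true=0.75, z=1.96, seed=0):
    """Return (proposed, naive) coverage proportions over `runs` replications."""
    rng = random.Random(seed)
    hit_proposed = hit_naive = 0
    for _ in range(runs):
        # G_i: hits out of `shots` for a player with P_i ~ Uniform(0.5, 1).
        G = []
        for _ in range(players):
            p_i = rng.uniform(0.5, 1.0)
            G.append(sum(rng.random() < p_i for _ in range(shots)))
        T = sum(G) / (players * shots)   # equal k_i, so T is the pooled hit rate
        D = sum((g / shots - T) ** 2 for g in G)
        E = math.sqrt(D / (players * (players - 1)))
        naive = math.sqrt(T * (1 - T) / (players * shots))  # half-width basis of (4)
        hit_proposed += abs(T - p_true) <= z * E
        hit_naive += abs(T - p_true) <= z * naive
    return hit_proposed / runs, hit_naive / runs

print(coverage())
```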
APPENDIX - SIMULATION PROGRAM


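1 REM X COUNTS COVERAGE BY THE PROPOSED INTERVAL, Y BY INTERVAL (4)
5 X=0:Y=0
10 DIM P(4),X(4,5),G(4)
15 CNT=0
18 REM DRAW EACH PLAYER'S HIT PROBABILITY AND SIMULATE 5 SHOTS
20 FOR I=1 TO 4
30 P(I)=.5*RND(1)+.5
40 FOR J=1 TO 5
50 X(I,J)=0
60 H=RND(1)
70 IF H<=P(I) THEN X(I,J)=1
80 NEXT J: NEXT I
82 REM G(I) = HITS BY PLAYER I, T = OVERALL PROPORTION OF HITS
85 T=0
90 FOR I=1 TO 4
100 G(I)=0
110 FOR J=1 TO 5
120 G(I)=G(I)+X(I,J): NEXT J
130 T=T+G(I): NEXT I
140 T=T/20
145 REM D = SUM OF SQUARED DEVIATIONS OF G(I)/5 FROM T
150 D=0
160 FOR I=1 TO 4: D=D+(G(I)/5-T)^2
170 NEXT I
175 REM E = SQR(D/(N*(N-1))) WITH N=4 PLAYERS
180 E=SQR(D/12)
195 REM COUNT THE INTERVALS THAT CONTAIN P=.75
200 IF ABS(T-.75)<=1.96*E THEN X=X+1
210 IF ABS(T-.75)<=1.96*SQR(T*(1-T)/20) THEN Y=Y+1
220 CNT=CNT+1
230 IF CNT<500 THEN 20
240 PRINT "XBAR=";X/500;"YBAR=";Y/500
250 END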


